7 research outputs found

    Real-time video-plus-depth content creation utilizing time-of-flight sensor - from capture to display

    Recent developments in 3D camera technologies, display technologies and other related fields have been aiming to provide a 3D experience for home users and to establish services such as Three-Dimensional Television (3DTV) and Free-Viewpoint Television (FTV). Emerging multiview autostereoscopic displays do not require any eyewear and can be watched by multiple users at the same time, and are therefore very attractive for home use. To provide a natural 3D impression, autostereoscopic 3D displays have been designed to synthesize multi-perspective virtual views of a scene using Depth-Image-Based Rendering (DIBR) techniques. One key issue of DIBR is that scene depth information, in the form of a depth map, is required in order to synthesize virtual views. Acquiring this information is a complex and challenging task and still an active research topic. In this thesis, the problem of dynamic 3D video content creation of real-world visual scenes is addressed. The work assumes a data acquisition setting comprising a Time-of-Flight (ToF) depth sensor and a single conventional video camera. The main objective of the work is to develop efficient algorithms for the stages of synchronous data acquisition, color and ToF data fusion, and final view-plus-depth frame formatting and rendering. The outcome of this thesis is a prototype 3DTV system capable of rendering live 3D video on an autostereoscopic 3D display. The presented system makes extensive use of the processing capabilities of modern Graphics Processing Units (GPUs) in order to achieve real-time processing rates while providing acceptable visual quality. Furthermore, the issue of arbitrary view synthesis is investigated in the context of DIBR and a novel approach based on depth layering is proposed. The proposed approach is applicable to general virtual view synthesis, i.e. for different camera parameters such as position, orientation and focal length, and for varying sensor spatial resolutions. The experimental results demonstrate the real-time capability of the proposed method even for CPU-based implementations. It compares favorably to other view synthesis methods in terms of visual quality, while being more computationally efficient.
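    For illustration, a minimal sketch of the DIBR forward-warping step that such view synthesis builds on is given below, assuming pinhole camera models with known intrinsics and extrinsics; the function name and the z-buffered splatting are illustrative choices, not the thesis' actual GPU pipeline or its depth-layering approach.

```python
# Minimal DIBR forward-warping sketch (illustrative only; not the thesis' pipeline).
# Assumptions: pinhole cameras, depth stores metric depth along the optical axis,
# K_ref/K_virt are 3x3 intrinsics, (R, t) map reference-camera to virtual-camera coordinates,
# color is an (H, W, 3) image aligned with the (H, W) depth map.
import numpy as np

def dibr_forward_warp(color, depth, K_ref, K_virt, R, t):
    h, w = depth.shape
    # Pixel grid in homogeneous coordinates
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(np.float64)

    # Back-project to 3D in the reference camera frame
    rays = np.linalg.inv(K_ref) @ pix            # 3 x N, third component equals 1
    points = rays * depth.reshape(1, -1)         # scale by metric depth

    # Transform to the virtual camera frame and project
    points_v = R @ points + t.reshape(3, 1)
    proj = K_virt @ points_v
    z = proj[2]
    valid = z > 1e-6
    uv = np.round(proj[:2, valid] / z[valid]).astype(int)

    # Z-buffered splatting: the nearest surface wins at each target pixel
    out = np.zeros_like(color)
    zbuf = np.full((h, w), np.inf)
    src_colors = color.reshape(-1, color.shape[-1])[valid]
    for (x, y), depth_v, c in zip(uv.T, z[valid], src_colors):
        if 0 <= x < w and 0 <= y < h and depth_v < zbuf[y, x]:
            zbuf[y, x] = depth_v
            out[y, x] = c
    return out  # disoccluded pixels remain as holes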

    Fast and Accurate Depth Estimation from Sparse Light Fields

    We present a fast and accurate method for dense depth reconstruction from sparsely sampled light fields obtained using a synchronized camera array. In our method, the source images are over-segmented into non-overlapping compact superpixels that are used as basic data units for depth estimation and refinement. The superpixel representation provides a desirable reduction in the computational cost while preserving the image geometry with respect to the object contours. Each superpixel is modeled as a plane in the image space, allowing depth values to vary smoothly within the superpixel area. Initial depth maps, which are obtained by plane sweeping, are iteratively refined by propagating good correspondences within an image. To ensure fast convergence of the iterative optimization process, we employ a highly parallel propagation scheme that operates on all the superpixels of all the images at once, making full use of the parallel graphics hardware. A few optimization iterations of the energy function, incorporating superpixel-wise smoothness and geometric consistency constraints, allow depth to be recovered with high accuracy in textured and textureless regions as well as in areas with occlusions, producing dense, globally consistent depth maps. We demonstrate that while the depth reconstruction takes about a second per full high-definition view, the accuracy of the obtained depth maps is comparable with state-of-the-art results.
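    As a rough illustration of the plane-sweep initialization mentioned above, the sketch below sweeps integer disparity hypotheses for a rectified two-view case; this is an assumption made for brevity, not the paper's general camera-array setup, and the superpixel plane fitting and parallel propagation are omitted.

```python
# Minimal disparity-sweep sketch (illustrative; assumes rectified stereo with a
# purely horizontal baseline, unlike the general multi-view setting of the paper).
import numpy as np

def sweep_initial_depth(ref, src, max_disp):
    """ref, src: grayscale float images of shape (H, W). Returns the per-pixel
    disparity minimizing the absolute intensity difference over the swept hypotheses."""
    h, w = ref.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        # Shift the source view by the disparity hypothesis and compare
        shifted = np.full_like(src, np.nan)
        if d == 0:
            shifted[:] = src
        else:
            shifted[:, d:] = src[:, :-d]
        diff = np.abs(ref - shifted)
        diff[np.isnan(diff)] = np.inf
        cost[d] = diff
    return np.argmin(cost, axis=0)  # winner-take-all disparity map
```

    Per the abstract, the per-pixel winner-take-all step would instead be replaced by modeling each superpixel as a plane so that depth varies smoothly within it; that fitting and the iterative refinement are not reproduced here.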

    Content-Adaptive Superpixel Segmentation Via Image Transformation

    We propose a simple and efficient method that produces content-adaptive superpixels, i.e. smaller segments in content-dense areas and larger segments in content-sparse areas. Previous adaptive methods distribute superpixels over the image according to the image content. In contrast, we transform the image itself to redistribute the content density uniformly across the image area. This transformation is guided by a significance map, which characterizes the ‘importance’ of each pixel. An arbitrary superpixel algorithm can then be utilized to segment the transformed image into regular superpixels, providing a suitable representation for subsequent tasks. Regular superpixels in the transformed image induce content-adaptive superpixels in the original image, facilitating improved segmentation accuracy.
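    A minimal, separable sketch of the transformation idea follows: a gradient-based significance map is integrated along each axis and the image is resampled so that significance is spread uniformly. Both the gradient-based significance map and the axis-wise remapping are assumptions made for illustration and are not necessarily the transformation used in the paper.

```python
# Sketch: stretch content-dense areas so that a regular segmentation of the
# transformed image becomes content-adaptive in the original one.
# The separable, axis-wise remapping below is a simplification for illustration.
import numpy as np

def significance_map(gray):
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy) + 1e-3          # small floor keeps flat areas mapped

def axis_remap(profile, out_len):
    """Map output coordinates back to input coordinates so that the integrated
    significance is uniform along the axis (inverse-CDF sampling)."""
    cdf = np.cumsum(profile)
    cdf /= cdf[-1]
    targets = np.linspace(0.0, 1.0, out_len)
    return np.interp(targets, cdf, np.arange(len(profile)))

def transform_image(gray):
    sig = significance_map(gray)
    # 1D significance profiles per axis
    rows = axis_remap(sig.sum(axis=1), gray.shape[0])
    cols = axis_remap(sig.sum(axis=0), gray.shape[1])
    # Resample the image at the remapped (non-uniform) coordinates
    r = np.clip(np.round(rows).astype(int), 0, gray.shape[0] - 1)
    c = np.clip(np.round(cols).astype(int), 0, gray.shape[1] - 1)
    warped = gray[np.ix_(r, c)]
    return warped, r, c   # r, c allow labels to be mapped back to the original grid
```

    Any off-the-shelf superpixel method (e.g. SLIC) could then segment `warped` into regular superpixels, and the returned coordinates map the labels back onto the original image.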

    CPU-efficient free view synthesis based on depth layering

    No full text

    A Speed-optimized RGB-Z capture system with improved denoising capabilities

    We have developed an end-to-end system for 3D scene sensing which combines a conventional high-resolution RGB camera with a low-resolution Time-of-Flight (ToF) range sensor. The system comprises modules for range data denoising, data re-projection and non-uniform to uniform up-sampling, and aims at composing high-resolution 3D video output for driving autostereoscopic 3D displays in real time. In our approach, the ToF sensor is set to work with a short integration time with the aim of increasing the capture speed and decreasing the amount of motion artifacts. However, the reduced integration time leads to noisy range images. We specifically address the noise reduction problem by performing a modification of non-local means filtering in the spatio-temporal domain. Time-consecutive range images are utilized not only for efficient denoising but also for accurate non-uniform to uniform up-sampling on the high-resolution RGB grid. Use is made of the reflectance signal of the ToF sensor for providing confidence-type feedback to the denoising module, where a new adaptive averaging is proposed to effectively handle motion artifacts. As far as the non-uniform to uniform resampling of the range data is concerned, we have developed two alternative solutions: one relying entirely on the GPU power and another applicable to any general platform. The latter method employs an intermediate virtual range camera recentering, after which the resampling process reduces to a 2D interpolation performed within the low-resolution grid. We demonstrate the real-time performance of the system working in a low-power regime.
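    As an illustration of the confidence-guided temporal averaging described above, the sketch below weights time-consecutive range frames by range similarity (to suppress motion artifacts) and by ToF amplitude (as a confidence cue). It is a much-simplified stand-in for the system's modified spatio-temporal non-local means, and the parameters sigma_r and sigma_a are hypothetical.

```python
# Simplified confidence-weighted temporal averaging of ToF range frames;
# stands in for the modified spatio-temporal non-local means of the paper.
# `amp_frames` holds the ToF amplitude/reflectance images used as a confidence cue.
import numpy as np

def temporal_denoise(range_frames, amp_frames, sigma_r=0.05, sigma_a=0.2):
    """range_frames, amp_frames: lists of (H, W) float arrays, newest frame last."""
    ref_r = range_frames[-1]
    ref_a = amp_frames[-1]
    acc = np.zeros_like(ref_r)
    wsum = np.zeros_like(ref_r)
    for r, a in zip(range_frames, amp_frames):
        # Penalize large range changes (likely motion) and low-amplitude (noisy) pixels
        w_motion = np.exp(-((r - ref_r) ** 2) / (2 * sigma_r ** 2))
        w_conf = 1.0 - np.exp(-(np.minimum(a, ref_a) ** 2) / (2 * sigma_a ** 2))
        w = w_motion * w_conf + 1e-8
        acc += w * r
        wsum += w
    return acc / wsum
```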